ABSTRACT

I calculate the statistics of correlation of two digitized noiselike signals: drawn from complex 
Gaussian distributions, sampled, quantized, correlated, and averaged. Averaged over many such 
samples, the correlation r approaches a Gaussian distribution. The mean and variance of r 
fully characterize the distribution of r. The mean corresponds to the reproducible part of the 
measurement, and the variance corresponds to the random part, or noise. I investigate the 
case of non-negligible covariance ρ between the signals. Noise in the correlation can increase or
decrease, depending on quantizer parameters, when ρ increases. This contrasts with correlation
of continuously-valued or unquantized signals, for which the noise in phase with ρ increases with
increasing ρ, and noise out of phase decreases. Indeed, for some quantizer parameters, I find that
correlation of quantized signals provides a more accurate estimate of ρ than would correlation
without quantization. I present analytic results in exact form and as polynomial expansions, and
compare these mathematical results with results of computer simulations. 

Subject headings: methods: data analysis, statistical - techniques: interferometric



1. INTRODUCTION 

Nearly all signals from astrophysical sources can be represented as electric fields comprised of Gaussian 
noise. These noiselike signals have zero mean. All information about the source is contained in the variance 
of the electric field, and in covariances between different polarizations, positions, times, or frequency ranges. 
The intensity, for example, is simply the sum of the variances in 2 basis polarizations. More generally, all 
the Stokes parameters can be expressed in terms of the variances and covariances of these 2 polarizations.
Similarly, in interferometry, the covariance of electric fields at different positions is the visibility, the Fourier 
transform of source structure. In correlation spectroscopy, the covariances of electric field at different time 
separations, expressed as the autocorrelation function, are the Fourier transform of the spectrum. Because 
the signals are drawn from Gaussian distributions, their variances and covariances completely characterize 
them. The single known exception to this rule of Gaussian statistics is radiation from pulsars, under certain 
observing conditions (Jenet, Anderson, & Prince 2001).

Particularly at wavelengths of a millimeter or more, covariances are usually estimated by correlation.
(Correlation is in fact used at all wavelengths, but at wavelengths shortward of a millimeter
quantum-mechanical processes complicate the picture.) Correlation involves forming products of
samples of the two signals. The average of many such products approximates the covariance. In mathematical
terms, for two signals x and y, the covariance is ρ = ⟨xy⟩, where the angular brackets ⟨...⟩ represent a statistical
average over an ensemble of all statistically identical signals. Correlation approximates this
infinite average with a finite average over N_q samples of x and y: r_∞ = (1/N_q) Σ_i x_i y_i. Here, the subscript
"∞" reflects the fact that x and y are unquantized; their accuracy is not limited to a finite number of
quantized levels. The subscript "q" indicates sampling at the Nyquist rate, as I assume (see Thompson,
Moran, & Swenson 1986).

Because the number of samples in most measurements of correlation is large, the results of a finite
correlation follow a Gaussian distribution. This is a consequence of the central limit theorem. Thus, one
expects a set of identical measurements of r_∞ to be fully characterized by their mean, ⟨r_∞⟩, and their standard
deviation, √⟨(r_∞ − ⟨r_∞⟩)²⟩. The mean is the deterministic part of the measurement; it provides an estimate
of ρ. The standard deviation characterizes the random part of the measurement, and is often called "noise"
(but is to be distinguished from the noiselike signals x and y that are being correlated). In principle, the
best measurement minimizes the random part, while preserving the relation between the deterministic part
and ρ. The signal-to-noise ratio of the correlation, ℛ_∞ = ⟨r_∞⟩/√⟨(r_∞ − ⟨r_∞⟩)²⟩, provides a figure of merit
that quantifies the relative sizes of deterministic and random parts; see, for example, Thompson, Moran, &
Swenson (1986).

The electric field is commonly digitized before correlation. Digitization includes sampling and quantization.
Sampling involves averaging the signal over short time windows; it thus restricts the range of frequencies
that can be uniquely represented. For simplicity, in this paper I restrict discussion to "video" or "baseband"
signals, for which frequencies up to some maximum frequency are present, and I assume that they are
sampled at the Nyquist rate, that is, at intervals of half the shortest period represented. I also assume that the
signals are "white," in the sense that samples x_i and y_j are correlated only if i = j; and that the signals are
stationary, so that the correlation of x_i and y_i is independent of i. These assumptions limit the influence of sampling.
I will discuss spectrally-varying signals elsewhere (Gwinn 2003).

Quantization limits the values that can be represented, so that the digitized signal imperfectly represents
the actual signal. Quantization thus introduces changes in both the mean correlation ⟨r_M⟩ and its standard
deviation √⟨(r_M − ⟨r_M⟩)²⟩. Here the subscript "M" represents the fact that the quantized signal can take
on M discrete values. The mean and standard deviation of r_M can be calculated from the statistics of the
quantized signals x̂_i and ŷ_i and the details of the quantization scheme.

A number of previous authors have addressed the effects of quantization on correlation of noiselike
signals (see, for example, Thompson, Moran, & Swenson (1986), Chapter 8, and references therein). Notably,
Van Vleck & Middleton (1966) found the average correlation and the standard deviation for two-level
quantization: the case where quantization reduces the signals to only their signs. Cooper (1970) found
the average correlation and its standard deviation, for small normalized covariance ρ << 1, for four-level
correlation: quantization reduces the signals to signs and whether they lie above or below some threshold
v_0. He found the optimal values of v_0, and of the relative weighting n of points above and below the
threshold, as quantified by the signal-to-noise ratio ℛ_4. Hagen & Farley (1973) generalized this to a broader range
of quantization schemes, and studied effects of oversampling. Bowers & Klingler (1974) examined Gaussian
noise and a signal of general form in the small-signal limit. They devise a criterion for the accuracy with
which a quantization scheme represents a signal, and show that this yields the highest signal-to-noise ratio
for ρ << 1. Most recently, Jenet & Anderson (1998) examined the case of many-level correlators. They use
a criterion similar to that of Bowers & Klingler (1974) to calculate the optimal level locations for various
numbers of levels. Jenet & Anderson (1998) also find the mean spectrum for a spectrally-varying source,
as measured by an autocorrelation spectrometer; they find that quantization introduces a uniform offset to
the spectrum, and scales the spectrum by a factor, and calculate the offset and factor. D'Addario et al.
(1984) provide an extensive analysis of errors in a 3-level correlator.

In this paper, I calculate the average correlation and its standard deviation for nonvanishing covariance
ρ. I provide exact expressions for these quantities, and approximations valid through fourth order in ρ.
Interestingly, noise actually declines for large ρ, for correlation of quantized signals. Indeed, the
signal-to-noise ratio for larger ρ can actually exceed that for correlation of an unquantized signal. In other words,
correlation of a quantized signal can provide a more accurate measure of ρ than would correlation of the
signal before quantization. This fact is perhaps surprising; it reflects the fact that correlation is not always
the most accurate way to determine the covariance of two signals.

The organization of this paper is as follows: In section 2, I review the statistics of the correlation of 
unquantized, or continuously variable, complex signals. In section 3, I present expressions that give the 
average correlation and the standard deviation, in terms of integrals involving the characteristic curve. I 
include statistics of real and imaginary parts, which are different. I present expansions of these integrals as a
power series in ρ. In section 4 I discuss computer simulations of correlation to illustrate these mathematical
results. I summarize the results in the final section. 



2. CORRELATION OF UNQUANTIZED VARIABLES 

2.1. Bivariate Gaussian Distribution 

Consider two random, complex signals x and y. Suppose that each of these signals is a random variable
drawn from a Gaussian distribution. Suppose that the signals x and y are correlated, so that they are, in
fact, drawn from a joint Gaussian probability density function P(x, y) (Meyer 1975). Without loss of
generality, I assume that each signal has variance of 2:

    \langle x x^* \rangle = \langle y y^* \rangle = 2.    (1)

In this expression, the angular brackets ⟨...⟩ denote a statistical average: in other words, an average over
all systems with the specified statistics. This choice for a variance of 2 for x and y is consistent with the
literature on this subject, much of which treats real signals (rather than complex ones), drawn from Gaussian
distributions with unit variance (Cooper 1970; Thompson, Moran, & Swenson 1986). Of course, the results
presented here are easily scaled to other variances for the input signal. I demand that the signals themselves
have no intrinsic phase: in other words, the statistics remain invariant under the transformation x → x e^{iφ_0}
and y → y e^{iφ_0}, where φ_0 is an arbitrary overall phase. It then follows that

    \langle x x \rangle = \langle y y \rangle = 0.    (2)

From these facts, one finds:

    \langle \mathrm{Re}[x]\,\mathrm{Re}[x] \rangle = \langle \mathrm{Im}[x]\,\mathrm{Im}[x] \rangle = 1    (3)
    \langle \mathrm{Re}[x]\,\mathrm{Im}[x] \rangle = 0,

and similarly for y. Thus, real and imaginary parts are drawn from Gaussian distributions with unit variance.
The distributions are circular in the complex plane for both x and y.

Without loss of generality, I assume that the normalized covariance of the signals ρ is purely real:

    \rho = \frac{\langle x y^* \rangle}{\sqrt{\langle |x|^2 \rangle \langle |y|^2 \rangle}}.    (4)

One can always make ρ purely real by rotating x (or y) in the complex plane: x → x e^{iφ_0}, and y → y. Note
that, because of the absence of any intrinsic phase,

    \langle \mathrm{Re}[x]\,\mathrm{Re}[y] \rangle = \langle \mathrm{Im}[x]\,\mathrm{Im}[y] \rangle = \rho    (5)
    \langle \mathrm{Re}[x]\,\mathrm{Im}[y] \rangle = \langle \mathrm{Im}[x]\,\mathrm{Re}[y] \rangle = 0,

so that

    \langle x y \rangle = 0.    (6)

In other words, the real parts of x and y are correlated; and the imaginary parts of x and y are correlated;
but real and imaginary parts are uncorrelated. In mathematical terms, real parts (or imaginary parts) are
drawn from the bivariate Gaussian distribution:

    P_2(X, Y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{ -\frac{X^2 + Y^2 - 2\rho X Y}{2(1-\rho^2)} \right\},    (7)

where (X, Y) stands for either (Re[x], Re[y]) or (Im[x], Im[y]). The distributions for real and imaginary parts
are identical, but real and imaginary parts are uncorrelated. Therefore,

    P(x, y) = P_2(\mathrm{Re}[x], \mathrm{Re}[y]) \times P_2(\mathrm{Im}[x], \mathrm{Im}[y]).    (8)
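
A minimal sketch of how samples with these statistics might be drawn, assuming the standard construction in which each part of y mixes the matching part of x with fresh unit-variance noise. The function names `draw_pair` and `sample_moments`, the seed, and the sample count are illustrative choices, not taken from the text.

```python
import math
import random


def draw_pair(rho, rng):
    """Return one pair (x, y) of complex Gaussian samples with
    <x x*> = <y y*> = 2 and real normalized covariance rho (Eqs. 1-6)."""
    s = math.sqrt(1.0 - rho * rho)
    # Real and imaginary parts of x: independent, unit variance (Eq. 3).
    xr, xi = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Each part of y is correlated only with the matching part of x (Eq. 5).
    yr = rho * xr + s * rng.gauss(0.0, 1.0)
    yi = rho * xi + s * rng.gauss(0.0, 1.0)
    return complex(xr, xi), complex(yr, yi)


def sample_moments(rho, n_samples, seed=1):
    """Sample estimates of <|x|^2>, <Re[x]Re[y]> and <Re[x]Im[y]>."""
    rng = random.Random(seed)
    power = re_re = re_im = 0.0
    for _ in range(n_samples):
        x, y = draw_pair(rho, rng)
        power += abs(x) ** 2
        re_re += x.real * y.real
        re_im += x.real * y.imag
    return power / n_samples, re_re / n_samples, re_im / n_samples
```

With many samples the three estimates should approach 2, ρ, and 0 respectively, to within sampling noise.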



2.2. Correlation of Unquantized Signals 
2.2.1. Correlation 



Consider the product of two random complex signals, drawn from Gaussian distributions as in the
previous section, sampled in time. Suppose that the signals are not quantized: they can take on any
complex value. The product of a pair of samples c_i = x_i y_i^* does not follow a Gaussian distribution. Rather,
the distribution of c_i is the product of an exponential of the real part, multiplied by the modified Bessel
function of the second kind of order zero of the magnitude of c_i (Gwinn 2001). However, the average of
many such products, averaged over a large number of pairs of samples, approaches a Gaussian distribution,
as the central limit theorem implies. Such a large but finite sum,

    r_\infty = \frac{1}{2 N_q} \sum_{i=1}^{N_q} x_i y_i^*,    (9)

provides an estimate of the covariance ρ. Here the index i runs over the samples, commonly samples taken at
different times. The total number of samples correlated is N_q. Henceforth I assume that in all summations,
indices run from 1 to N_q. The subscript ∞ on the correlation r_∞ again indicates that the correlation has
been formed for variables x_i, y_i that can take on any of an infinite number of values; in other words, it
indicates that x_i and y_i have not been quantized.



2.2.2. Mean Correlation for Unquantized Signals 
The mean correlation is equal to the covariance, in a statistical average:

    \langle r_\infty \rangle = \langle \mathrm{Re}[r_\infty] \rangle = \frac{1}{2} \left( \langle \mathrm{Re}[x_i]\,\mathrm{Re}[y_i] \rangle + \langle \mathrm{Im}[x_i]\,\mathrm{Im}[y_i] \rangle \right) = \rho,    (10)

where I used the assumption that the phase of ρ is zero, Eq. 5. This can also be seen from the mean of the
distribution of the products c_i = x_i y_i^* (Gwinn 2001), or simply by integrating over the joint distribution of
x_i and y_i, Eqs. 7 and 8.



2.2.3. Noise for Correlation of Unquantized Signals 

Because the distribution of r_∞ is Gaussian, it is completely characterized by its
mean, Eq. 10, and by the variances ⟨r_∞ r_∞^*⟩ and ⟨r_∞ r_∞⟩. Suppose that the samples x_i and y_i are independent;
in mathematical terms, suppose that

    \langle x_i y_j^* \rangle = 0, \quad \mathrm{for}\ i \ne j.    (11)

If they are not independent the results will depend on the correlations among samples; this case is important
if, for example, the signal has significant spectral structure. Jenet & Anderson (1998) find the average
spectrum in this case; I will discuss the noise in future work. Here I consider only independent samples. In
that case,

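    \langle r_\infty r_\infty^* \rangle = \frac{1}{4N_q^2} \sum_{i,j} \langle x_i y_i^* x_j^* y_j \rangle    (12)
      = \frac{1}{4N_q^2} \left[ \sum_i \langle x_i y_i^* x_i^* y_i \rangle + \sum_{i \ne j} \langle x_i y_i^* \rangle \langle x_j^* y_j \rangle \right]
      = \frac{1}{4N_q} \langle x_i y_i^* x_i^* y_i \rangle + \frac{N_q - 1}{4N_q} \langle x_i y_i^* \rangle \langle x_i^* y_i \rangle,

where I have separated the terms with i = j from those with i ≠ j, and appealed to the facts that x_i and
y_j are covariant only if i = j ("white" signals), and that for i = j the statistics are stationary in i. For
Gaussian variables with zero mean a, b, c, d, all moments are related to the second moments:

    \langle a b c d \rangle = \langle a b \rangle \langle c d \rangle + \langle a c \rangle \langle b d \rangle + \langle a d \rangle \langle b c \rangle,    (13)

so that:

    \langle x_i y_i^* x_i^* y_i \rangle = (2\rho)^2 + 4.    (14)

Therefore,

    \langle r_\infty r_\infty^* \rangle = \rho^2 + \frac{1}{N_q}.    (15)

An analogous calculation yields

    \langle x_i y_i^* x_i y_i^* \rangle = 2 (2\rho)^2,    (16)

and so

    \langle r_\infty r_\infty \rangle - \langle r_\infty \rangle^2 = \frac{1}{N_q} \rho^2.    (17)

I combine these facts to find the means and standard deviations of the real and imaginary parts of the
measured correlation, r_∞:

    \langle \mathrm{Re}[r_\infty] \rangle = \langle r_\infty \rangle = \rho    (18)
    \langle \mathrm{Im}[r_\infty] \rangle = 0
    \langle \mathrm{Re}[r_\infty]^2 \rangle - \langle \mathrm{Re}[r_\infty] \rangle^2 = \frac{1}{2} \left\{ \langle r_\infty r_\infty^* \rangle + \langle r_\infty r_\infty \rangle \right\} - \langle r_\infty \rangle^2 = \frac{1}{2N_q} (1 + \rho^2)
    \langle \mathrm{Im}[r_\infty]^2 \rangle = \frac{1}{2} \left\{ \langle r_\infty r_\infty^* \rangle - \langle r_\infty r_\infty \rangle \right\} = \frac{1}{2N_q} (1 - \rho^2).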

If the number of independent samples N_q is large, the central limit theorem implies that Re[r_∞] and
Im[r_∞] are drawn from Gaussian distributions. The means and variances of these distributions, as given
by Eq. 18, completely characterize r_∞. The fact that the real part of r_∞ has greater standard deviation
than the imaginary part reflects the presence of self-noise or source-noise. Sometimes this is described as
the contribution of the noiselike signal to the noise in the result.
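
The variances of Eq. 18 can be checked with a small Monte Carlo experiment. The sketch below assumes the normalization r_∞ = (1/2N_q) Σ x_i y_i^* with white, stationary samples; the function names, seed, and trial counts are illustrative assumptions, not part of the original analysis.

```python
import math
import random


def correlate_unquantized(rho, n_q, rng):
    """One measurement r_inf = (1 / (2 N_q)) * sum_i x_i * conj(y_i)."""
    s = math.sqrt(1.0 - rho * rho)
    total = 0j
    for _ in range(n_q):
        xr, xi = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        yr = rho * xr + s * rng.gauss(0.0, 1.0)
        yi = rho * xi + s * rng.gauss(0.0, 1.0)
        total += complex(xr, xi) * complex(yr, -yi)  # x_i times conj(y_i)
    return total / (2.0 * n_q)


def measured_statistics(rho, n_q, n_trials, seed=2):
    """Sample mean of Re[r], variance of Re[r], and mean square of Im[r]."""
    rng = random.Random(seed)
    r = [correlate_unquantized(rho, n_q, rng) for _ in range(n_trials)]
    mean_re = sum(v.real for v in r) / n_trials
    var_re = sum((v.real - mean_re) ** 2 for v in r) / n_trials
    var_im = sum(v.imag ** 2 for v in r) / n_trials
    return mean_re, var_re, var_im


def predicted_statistics(rho, n_q):
    """Mean and variances predicted for unquantized correlation (Eq. 18)."""
    return rho, (1.0 + rho ** 2) / (2.0 * n_q), (1.0 - rho ** 2) / (2.0 * n_q)
```

Comparing `measured_statistics(0.5, 100, 2000)` with `predicted_statistics(0.5, 100)` should show the real part noisier than the imaginary part, which is the self-noise described above.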

Commonly, and often realistically, astrophysicists suppose that ρ measures the intensity of one signal
that has been superposed with two uncorrelated noise signals to produce x and y (see Bowers & Klingler
(1974); Kulkarni (1989); Anantharamaiah et al. (1991)). A change in ρ then corresponds to a change in the
intensities ⟨|x|²⟩ and ⟨|y|²⟩ as well. Here, I suppose that ⟨|x|²⟩ and ⟨|y|²⟩ remain fixed at the values of Eq. 1,
while ρ varies. The results presented here can be scaled to those for the alternative interpretation.



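2.2.4. SNR for Correlation of Unquantized Signals



The signal-to-noise ratio (SNR) for Re[r_∞] is:

    \mathcal{R}_\infty(\mathrm{Re}[r_\infty]) = \frac{\langle \mathrm{Re}[r_\infty] \rangle}{\sqrt{\langle \mathrm{Re}[r_\infty]^2 \rangle - \langle \mathrm{Re}[r_\infty] \rangle^2}} = \sqrt{2N_q}\, \frac{\rho}{\sqrt{1+\rho^2}}.    (19)

Note that for a given number of observations N_q, SNR increases with ρ; the increase is proportional for
ρ << 1.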

A quantity related to SNR is the rms phase, statistically averaged over many measurements. The phase
is φ[r_∞] = tan⁻¹(Im[r_∞]/Re[r_∞]). When the number of observations is sufficiently large, and the true value
of the phase is 0, as assumed here, the standard deviation of the phase is √⟨φ[r_∞]²⟩ = √⟨Im[r_∞]²⟩/⟨Re[r_∞]⟩.
The inverse of the standard deviation of the phase (in radians) is analogous to the SNR for the real part,
Eq. 19. This SNR for phase is:

    \mathcal{R}_\infty(\phi[r_\infty]) = \frac{\langle \mathrm{Re}[r_\infty] \rangle}{\sqrt{\langle \mathrm{Im}[r_\infty]^2 \rangle}} = \sqrt{2N_q}\, \frac{\rho}{\sqrt{1-\rho^2}}.    (20)
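
The two unquantized SNRs follow directly from the variances of the real and imaginary parts, (1 ± ρ²)/(2N_q). A minimal sketch, with hypothetical function names, evaluates them for given ρ and N_q.

```python
import math


def snr_real_unquantized(rho, n_q):
    """SNR of Re[r_inf]: sqrt(2 N_q) * rho / sqrt(1 + rho^2)."""
    return math.sqrt(2.0 * n_q) * rho / math.sqrt(1.0 + rho * rho)


def snr_phase_unquantized(rho, n_q):
    """SNR of the phase: sqrt(2 N_q) * rho / sqrt(1 - rho^2), for |rho| < 1."""
    return math.sqrt(2.0 * n_q) * rho / math.sqrt(1.0 - rho * rho)
```

For small ρ both expressions reduce to √(2N_q) ρ; they separate as ρ grows because the noise in phase with ρ rises while the noise out of phase falls.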

For constant N_q, the SNR of the phase increases with ρ, and increases much faster than proportionately for
ρ → 1.



3. QUANTIZED SIGNALS 



Quantization converts the variables x, y to discrete variables x̂, ŷ. These discrete variables depend on
x and y through a multiple step function, known as a characteristic curve. Each step extends over some
range of the unquantized signal, and is given some weight n_i in correlation. The function X̂(X)
denotes the characteristic curve, where again X stands for either Re[x] or Im[x]. The complex quantized
variable x̂ is thus given by x̂ = X̂(Re[x]) + i X̂(Im[x]). The same characteristic curve is applied to real and
imaginary parts. I hold open the possibility that the characteristic curves X̂(X) and Ŷ(Y) are different. I
assume in this paper that the characteristic curves are antisymmetric: X̂(−X) = −X̂(X), and similarly for
Ŷ. D'Addario et al. (1984) describe effects of departures from antisymmetry, and how antisymmetry can be
enforced. Figure 1 shows a typical characteristic curve for 4-level (or 2-bit) sampling.

Systems with M levels of quantization can be described by analogous, more complicated characteristic
curves, and corresponding sets of weights {n_i} and levels {v_ix} and {v_iy} (Jenet & Anderson 1998). For
M-level sampling, the correlation of N_q quantized samples is

    r_M = \frac{1}{2N_q} \sum_{i=1}^{N_q} \hat{x}_i \hat{y}_i^*.    (21)

In practical correlators, deviations from theoretical performance can often be expressed as deviations of
the characteristic curve from its desired form. In principle, these can be measured by counting the numbers
of samples in the various quantization ranges, and using the assumed Gaussian distributions of the input
signals to determine the actual levels v_ix and v_iy. D'Addario et al. (1984) present an extensive discussion of
such errors, and techniques to control and correct them.

Although the results presented below are applicable to more complicated systems, the 4-level correlator
will be used as a specific example in this paper, with correlation r_4. For 4-level sampling, commonly a sign
bit gives the sign of X, and an amplitude bit assigns weight 1 if |X| is less than some threshold v_0x, and
weight n if |X| is greater than v_0x. Together, sign and amplitude bits describe the 4 values possible for
X̂(X). Other types of correlators, including 2-level, 3-level, or "reduced" 4-level (in which case the smallest
product, for |X̂| = 1, |Ŷ| = 1, is ignored), can be formed as special cases or sums of four-level correlators
(Hagen & Farley 1973).
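
The 4-level characteristic curve just described can be written as a short function. This is a sketch only: the names `four_level` and `quantize_complex` and the default values v_0 = 1, n = 3 are illustrative, and the value assigned exactly at X = 0 (an event of probability zero) is an arbitrary choice.

```python
def four_level(value, v0=1.0, n=3.0):
    """4-level characteristic curve: weight 1 for |X| below the threshold v0,
    weight n above it, carrying the sign of X (antisymmetric curve)."""
    weight = n if abs(value) > v0 else 1.0
    # X = 0 is mapped to +1; this has zero probability for Gaussian input.
    return weight if value >= 0.0 else -weight


def quantize_complex(x, v0=1.0, n=3.0):
    """Apply the same characteristic curve to real and imaginary parts."""
    return complex(four_level(x.real, v0, n), four_level(x.imag, v0, n))
```

Setting n = 1 reduces this curve to a 2-level (sign-only) quantizer.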

3.1. Correlation of Quantized Signals: Exact Results 

3.1.1. Mean Correlation: Exact Result 

Ideally, from measurement of the quantized correlation r_M one can estimate the true covariance ρ.
The statistical mean of r_M is:

    \langle r_M \rangle = \frac{1}{2N_q} \sum_i \langle \hat{x}_i \hat{y}_i^* \rangle    (22)
      = \frac{1}{2} \left( \langle \mathrm{Re}[\hat{x}]\,\mathrm{Re}[\hat{y}] + \mathrm{Im}[\hat{x}]\,\mathrm{Im}[\hat{y}] \rangle + i \langle -\mathrm{Re}[\hat{x}]\,\mathrm{Im}[\hat{y}] + \mathrm{Im}[\hat{x}]\,\mathrm{Re}[\hat{y}] \rangle \right).

Because Re[x̂] depends only on Re[x] and Im[ŷ] depends only on Im[y], and Re[x] and Im[y] are completely
independent (and similarly for Im[x̂] and Re[ŷ]),

    \langle \mathrm{Re}[\hat{x}]\,\mathrm{Im}[\hat{y}] \rangle = \langle \mathrm{Im}[\hat{x}]\,\mathrm{Re}[\hat{y}] \rangle = 0.    (23)

Thus, the imaginary part of ⟨r_M⟩, which involves products of these statistically independent terms, has average
zero (Eq. 3). For the real part,

    \langle \mathrm{Re}[\hat{x}]\,\mathrm{Re}[\hat{y}] \rangle = \langle \mathrm{Im}[\hat{x}]\,\mathrm{Im}[\hat{y}] \rangle,    (24)

where I use the assumption that the characteristic curves are identical for real and imaginary parts, and
that real and imaginary parts of x and y have identical statistics (Eqs. 3, 5). I use the bivariate Gaussian
distribution for real and imaginary parts to find a formal expression for the statistical average of the
correlation:

    \langle r_M \rangle = \langle \mathrm{Re}[\hat{x}]\,\mathrm{Re}[\hat{y}] \rangle = \Upsilon_{XY} = \int dX\, dY\, P_2(X, Y)\, \hat{X}(X)\, \hat{Y}(Y).    (25)

This integral defines Υ_XY. For the assumed antisymmetric characteristic curves X̂(X) and Ŷ(Y), one can
easily show that ∂Υ_XY/∂ρ > 0. In other words, the ensemble-averaged quantized correlation is an increasing
function of the covariance ρ, for completely arbitrary quantizer settings (so long as the characteristic curves
are antisymmetric).

The discussion of this section reduces the calculation of the average quantized correlation to that of
integrating P_2(X, Y) over each rectangle in a grid, with the edges of the rectangles given by the thresholds
in the characteristic curves (Kokkeler, Fridman, & van Ardenne 2001). The function Υ_XY depends on ρ
through P_2(X, Y). This function is usually expanded through first order in ρ, because ρ is small in most
astrophysical observations.



3.1.2. Simpler Form for Υ_XY

The integral Υ_XY and similar integrals can be converted into one-dimensional integrals for easier analysis.
If one defines

    Q(v_{0x}, v_{0y}) = \int_{v_{0x}}^{\infty} dX \int_{v_{0y}}^{\infty} dY\, P_2(X, Y),    (26)

then the Fourier transform of P_2(v_0x, v_0y) is equal to that of ∂Q(v_0x, v_0y)/∂ρ, as one finds from integration
by parts. Thus,

    Q(v_{0x}, v_{0y}) = \int_0^{\rho} d\rho_1\, P_2(v_{0x}, v_{0y}) + \frac{1}{4}\, \mathrm{erfc}\!\left( \frac{v_{0x}}{\sqrt{2}} \right) \mathrm{erfc}\!\left( \frac{v_{0y}}{\sqrt{2}} \right),    (27)

where P_2 in the integrand is evaluated with covariance ρ_1, and the second term is the value of Q at ρ = 0.
The integral Υ_XY is the sum of one such integral and one such constant for each step in the characteristic
curve. This one-dimensional form is useful for numerical evaluation and expansions. Kashlinsky,
Hernandez-Monteagudo, & Atrio-Barandela (2001) also present an interesting expansion of Υ_XY in Hermite
polynomials, in v_0.
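
A minimal numerical sketch of the one-dimensional form follows, for a 4-level curve with weights 1 and n. Two choices here are assumptions of the sketch rather than statements from the text: the substitution ρ_1 = sin θ (used to keep the integrand smooth) and the decomposition of the antisymmetric curve into upward steps, which gives the mean correlation as 2 Σ w_k w_l [Q(ρ) − Q(−ρ)] summed over pairs of steps. The names `q_upper` and `upsilon_xy` are hypothetical.

```python
import math


def q_upper(vx, vy, rho, steps=400):
    """Q(vx, vy) = P(X > vx, Y > vy) for unit-variance Gaussian X, Y with
    covariance rho, |rho| < 1, from the one-dimensional integral of Eq. 27.
    With rho_1 = sin(theta) the integrand becomes
    exp(-(vx^2 + vy^2 - 2 vx vy sin(theta)) / (2 cos(theta)^2)) / (2 pi)."""
    const = 0.25 * math.erfc(vx / math.sqrt(2.0)) * math.erfc(vy / math.sqrt(2.0))
    upper = math.asin(rho)
    if upper == 0.0:
        return const

    def integrand(theta):
        c = math.cos(theta)
        arg = vx * vx + vy * vy - 2.0 * vx * vy * math.sin(theta)
        return math.exp(-arg / (2.0 * c * c))

    # Composite Simpson rule on [0, asin(rho)]; steps must be even.
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return const + total * h / 3.0 / (2.0 * math.pi)


def upsilon_xy(rho, v0x=1.0, v0y=1.0, n=3.0):
    """Mean quantized correlation (Eq. 25) for 4-level curves, weights 1 and n."""
    steps_x = [(0.0, 1.0), (v0x, n - 1.0)]  # (threshold, jump in weight)
    steps_y = [(0.0, 1.0), (v0y, n - 1.0)]
    total = 0.0
    for tx, wx in steps_x:
        for ty, wy in steps_y:
            total += wx * wy * (q_upper(tx, ty, rho) - q_upper(tx, ty, -rho))
    return 2.0 * total
```

For n = 1 only the step at zero contributes, and the result reduces to (2/π) arcsin ρ, the two-level result of Van Vleck & Middleton.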



3.1.3. Noise: Exact Results 

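This section presents an exact expression for the variance of the correlation of a quantized signal, when
averaged over the ensemble of all statistically identical measurements. Real and imaginary parts of r_M have
different variances. This requires calculation of both ⟨r_M r_M^*⟩ and ⟨r_M r_M⟩. Note that:

    \langle r_M r_M^* \rangle = \frac{1}{4N_q^2} \sum_i \sum_j \langle \hat{x}_i \hat{y}_i^* \hat{x}_j^* \hat{y}_j \rangle    (28)
      = \frac{1}{4N_q^2} \left[ \sum_i \langle \hat{x}_i \hat{x}_i^* \hat{y}_i \hat{y}_i^* \rangle + \sum_{i \ne j} \langle \hat{x}_i \hat{y}_i^* \rangle \langle \hat{x}_j^* \hat{y}_j \rangle \right]
      = \frac{1}{4N_q} \langle \hat{x}_i \hat{x}_i^* \hat{y}_i \hat{y}_i^* \rangle + \frac{N_q - 1}{N_q} \langle r_M \rangle^2,

where the sum over i ≠ j is simplified by the fact that samples at different times are uncorrelated, and by
Eq. 22. I expand the first average in the last line:

    \langle \hat{x}_i \hat{x}_i^* \hat{y}_i \hat{y}_i^* \rangle = \langle \mathrm{Re}[\hat{x}_i]^2 \mathrm{Re}[\hat{y}_i]^2 + \mathrm{Im}[\hat{x}_i]^2 \mathrm{Im}[\hat{y}_i]^2 + \mathrm{Re}[\hat{x}_i]^2 \mathrm{Im}[\hat{y}_i]^2 + \mathrm{Im}[\hat{x}_i]^2 \mathrm{Re}[\hat{y}_i]^2 \rangle    (29)
      = \langle \mathrm{Re}[\hat{x}_i]^2 \mathrm{Re}[\hat{y}_i]^2 \rangle + \langle \mathrm{Im}[\hat{x}_i]^2 \mathrm{Im}[\hat{y}_i]^2 \rangle + \langle \mathrm{Re}[\hat{x}_i]^2 \rangle \langle \mathrm{Im}[\hat{y}_i]^2 \rangle + \langle \mathrm{Im}[\hat{x}_i]^2 \rangle \langle \mathrm{Re}[\hat{y}_i]^2 \rangle,

where I have used the fact that the real part of x has zero covariance with the imaginary part of y, and vice
versa. Because the real and imaginary parts have identical statistics, this sum can be expressed formally in
terms of the integrals:

    \langle \mathrm{Re}[\hat{x}_i]^2 \mathrm{Re}[\hat{y}_i]^2 \rangle = \langle \mathrm{Im}[\hat{x}_i]^2 \mathrm{Im}[\hat{y}_i]^2 \rangle = \Upsilon_{X2Y2} = \int dX\, dY\, P_2(X, Y)\, \hat{X}(X)^2\, \hat{Y}(Y)^2,    (30)
    \langle \mathrm{Re}[\hat{x}_i]^2 \rangle = \langle \mathrm{Im}[\hat{x}_i]^2 \rangle = A_{X2} = \int dX\, \frac{1}{\sqrt{2\pi}} e^{-X^2/2}\, \hat{X}(X)^2
    \langle \mathrm{Re}[\hat{y}_i]^2 \rangle = \langle \mathrm{Im}[\hat{y}_i]^2 \rangle = A_{Y2} = \int dY\, \frac{1}{\sqrt{2\pi}} e^{-Y^2/2}\, \hat{Y}(Y)^2.

These expressions define Υ_X2Y2, A_X2, and A_Y2. Thus,

    \langle r_M r_M^* \rangle = \frac{1}{N_q} \left\{ \frac{1}{2} \Upsilon_{X2Y2} + \frac{1}{2} A_{X2} A_{Y2} - (\Upsilon_{XY})^2 \right\} + (\Upsilon_{XY})^2.    (31)

Note that in these expressions A_X2 and A_Y2 are constants that depend on the characteristic curve, but not
on ρ; whereas Υ_XY and Υ_X2Y2 depend on ρ in complicated ways, as well as on the characteristic curve.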

Similarly,

    \langle r_M r_M \rangle = \frac{1}{4N_q} \langle \hat{x}_i \hat{y}_i^* \hat{x}_i \hat{y}_i^* \rangle + \frac{N_q - 1}{N_q} \langle r_M \rangle^2.    (32)

I again expand the first average:

    \langle \hat{x}_i \hat{y}_i^* \hat{x}_i \hat{y}_i^* \rangle = 2 \Upsilon_{X2Y2} - 2 A_{X2} A_{Y2} + 4 (\Upsilon_{XY})^2,    (33)

where I omit the imaginary terms, all of which average to zero. Therefore:

    \langle r_M r_M \rangle = \frac{1}{N_q} \left( \frac{1}{2} \Upsilon_{X2Y2} - \frac{1}{2} A_{X2} A_{Y2} \right) + (\Upsilon_{XY})^2.    (34)

Using the same logic as in the derivation of Eq. 18, Eqs. 31 and 34 can be used to find the means and
standard deviations of the real and imaginary parts of r_M:

    \langle \mathrm{Re}[r_M] \rangle = \Upsilon_{XY}    (35)
    \langle \mathrm{Im}[r_M] \rangle = 0
    \langle \mathrm{Re}[r_M]^2 \rangle - \langle \mathrm{Re}[r_M] \rangle^2 = \frac{1}{2N_q} \left( \Upsilon_{X2Y2} - (\Upsilon_{XY})^2 \right)
    \langle \mathrm{Im}[r_M]^2 \rangle = \frac{1}{2N_q} \left( A_{X2} A_{Y2} - (\Upsilon_{XY})^2 \right).

Again, note that A_X2 and A_Y2 are constants that depend on the characteristic curve, but not on the
covariance ρ, whereas Υ_XY and Υ_X2Y2 depend on the actual value of ρ as well as the characteristic curve.
For particular characteristic curves, and particular values of ρ, these expressions yield the mean
correlation, and the standard deviations of real and imaginary parts about the mean. Figures 2 through 5
show examples, and compare them with the approximate results from the following section.

Note that, because Υ_XY is an increasing function of ρ, the standard deviation of the imaginary part
decreases with increasing covariance ρ. This holds for arbitrary quantizer parameters, so long as the
characteristic curves are antisymmetric. In other words, the noise in the imaginary part always decreases when
the correlation increases.

For uncorrelated signals, ρ = 0. One finds then that Υ_XY = 0 and Υ_X2Y2 = A_X2 A_Y2. In this case both
real and imaginary parts have identical variances, as they must:

    \langle \mathrm{Re}[r_M]^2 \rangle - \langle \mathrm{Re}[r_M] \rangle^2 = \langle \mathrm{Im}[r_M]^2 \rangle = \frac{1}{2N_q} A_{X2} A_{Y2}, \quad \mathrm{for}\ \rho = 0.    (36)

This recovers the result of Cooper (1970) and others for the noise.

If the characteristic curves are identical, so that X̂(X) = Ŷ(Y), and if the signals are identical, so that ρ = 1,
one finds that Υ_XY = A_X2 = A_Y2. Under these assumptions, then, ⟨Im[r_M]²⟩ = 0.
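
The exact mean and variances can be evaluated numerically for a 4-level correlator with identical curves for x and y. The sketch below is illustrative only: it assumes weights 1 and n, evaluates the quadrant probability Q by a one-dimensional integral with the substitution ρ_1 = sin θ, and uses hypothetical function names.

```python
import math


def q_upper(vx, vy, rho, steps=400):
    """P(X > vx, Y > vy) for unit-variance Gaussians with covariance rho,
    |rho| < 1, via a one-dimensional Simpson integral over theta = asin(rho_1)."""
    const = 0.25 * math.erfc(vx / math.sqrt(2.0)) * math.erfc(vy / math.sqrt(2.0))
    upper = math.asin(rho)
    if upper == 0.0:
        return const

    def f(theta):
        c = math.cos(theta)
        return math.exp(-(vx * vx + vy * vy - 2.0 * vx * vy * math.sin(theta)) / (2.0 * c * c))

    h = upper / steps
    total = f(0.0) + f(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(k * h)
    return const + total * h / 3.0 / (2.0 * math.pi)


def four_level_moments(rho, v0=1.0, n=3.0):
    """Return (mean correlation, mean of squared curves, A_X2)."""
    def diff(t, s):
        return q_upper(t, s, rho) - q_upper(t, s, -rho)

    w = n - 1.0
    ups_xy = 2.0 * (diff(0.0, 0.0) + 2.0 * w * diff(0.0, v0) + w * w * diff(v0, v0))
    tail = math.erfc(v0 / math.sqrt(2.0))  # P(|X| > v0)
    both_tails = 2.0 * (q_upper(v0, v0, rho) + q_upper(v0, v0, -rho))  # P(|X|>v0, |Y|>v0)
    g = n * n - 1.0  # the squared curve equals 1 + g for |X| > v0, and 1 otherwise
    a_x2 = 1.0 + g * tail
    ups_x2y2 = 1.0 + 2.0 * g * tail + g * g * both_tails
    return ups_xy, ups_x2y2, a_x2


def variances(rho, n_q, v0=1.0, n=3.0):
    """Mean of Re[r_M] and the variances of Re[r_M] and Im[r_M] (Eq. 35)."""
    ups_xy, ups_x2y2, a_x2 = four_level_moments(rho, v0, n)
    var_re = (ups_x2y2 - ups_xy ** 2) / (2.0 * n_q)
    var_im = (a_x2 * a_x2 - ups_xy ** 2) / (2.0 * n_q)
    return ups_xy, var_re, var_im
```

At ρ = 0 the two variances coincide, as Eq. 36 requires, and the variance of the imaginary part falls as ρ grows.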



3.2. Correlation of Quantized Signals: Approximate Results 

3.2.1. Mean Correlation: Approximate Result 

Unfortunately the expressions for the mean correlation and the noise, for quantized signals, both depend
in a complicated way on the covariance ρ, the quantity one seeks to measure. Often the covariance ρ is small.
Various authors discuss the correlation r_M of quantized signals x̂ and ŷ to first order in ρ, as is appropriate
in the limit ρ → 0 (Van Vleck & Middleton 1966; Cooper 1970; Hagen & Farley 1973; Thompson, Moran, &
Swenson 1986; Jenet & Anderson 1998). Jenet & Anderson (1998) also calculate ⟨r_4⟩ for ρ = 1; as they point
out, this case is important for autocorrelation spectroscopy. D'Addario et al. (1984) present an expression
for ⟨r_M⟩ for a 3-level correlator, and present several useful approximate expressions for the inverse relationship
ρ(⟨r_M⟩). Here I find the mean correlation ⟨r_M⟩ through fourth order in ρ.

For small covariance ρ, one can expand P_2(X, Y) in Eq. 7 as a power series in ρ:

    P_2(X, Y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{ -\frac{X^2 + Y^2 - 2\rho X Y}{2(1-\rho^2)} \right\}    (37)
      = \frac{1}{2\pi} e^{-X^2/2} e^{-Y^2/2} \Big[ 1 + X Y \rho + \frac{1}{2} (1 - X^2)(1 - Y^2) \rho^2
          + \frac{1}{6} (3X - X^3)(3Y - Y^3) \rho^3
          + \frac{1}{24} (3 - 6X^2 + X^4)(3 - 6Y^2 + Y^4) \rho^4 + \ldots \Big].

Note that the coefficient of each term in this expansion over ρ can be separated into two factors that depend
on either X alone or Y alone. The extension to higher powers of ρ is straightforward, and the higher-order
coefficients have this property as well.
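
The accuracy of the truncated expansion is easy to check numerically. The following sketch compares the exact bivariate density with the series through fourth order in the covariance; the function names are illustrative.

```python
import math


def p2_exact(x, y, rho):
    """Bivariate Gaussian density of Eq. 7 (unit variances, covariance rho)."""
    norm = 2.0 * math.pi * math.sqrt(1.0 - rho * rho)
    return math.exp(-(x * x + y * y - 2.0 * rho * x * y) / (2.0 * (1.0 - rho * rho))) / norm


def p2_series(x, y, rho):
    """Expansion of Eq. 37, truncated after the fourth-order term."""
    gauss = math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)
    terms = (
        1.0
        + x * y * rho
        + 0.5 * (1 - x ** 2) * (1 - y ** 2) * rho ** 2
        + (3 * x - x ** 3) * (3 * y - y ** 3) * rho ** 3 / 6.0
        + (3 - 6 * x ** 2 + x ** 4) * (3 - 6 * y ** 2 + y ** 4) * rho ** 4 / 24.0
    )
    return gauss * terms
```

The truncation error scales as the fifth power of the covariance, so the agreement degrades quickly as the covariance approaches 1.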

As noted above, I assume that the characteristic curve is antisymmetric: X̂(−X) = −X̂(X). The
integral Υ_XY (Eq. 25) involves first powers of the functions X̂(X) and Ŷ(Y). In this integral, only terms
odd in both X and Y match the antisymmetry of the characteristic curve, and yield a nonzero result. Such
terms are also odd in ρ, as is seen from inspection of Eq. 37. The first-order terms thus involve the integrals:

    B_X = \int dX\, \frac{1}{\sqrt{2\pi}} e^{-X^2/2}\, \hat{X}(X)\, X    (38)
    B_Y = \int dY\, \frac{1}{\sqrt{2\pi}} e^{-Y^2/2}\, \hat{Y}(Y)\, Y,

where again X and Y can stand for either real or imaginary parts of x and y. Here, I consider terms up to
order 3 in ρ. One thus encounters the further integrals:

    D_X = \int dX\, \frac{1}{\sqrt{2\pi}} e^{-X^2/2}\, \hat{X}(X)\, X^3,    (39)

and the analogous expression for D_Y. Therefore, through fourth order in ρ:

    \langle r_M \rangle = \Upsilon_{XY} = \int dX\, dY\, \frac{1}{2\pi} e^{-(X^2 + Y^2)/2}\, \hat{X}(X)\, \hat{Y}(Y) \left[ X Y \rho + \frac{1}{6} (3X - X^3)(3Y - Y^3) \rho^3 + \ldots \right]    (40)
      \approx B_X B_Y \rho + \frac{1}{6} (3 B_X - D_X)(3 B_Y - D_Y) \rho^3 + \ldots.

For thresholds v_0 ≈ 1 the linear approximation is quite accurate (Cooper 1970; Thompson, Moran, & Swenson
1986; Jenet & Anderson 1998); however, for other values of v_0 the higher-order terms can become important.
The notation here differs from that of previous authors; B_X B_Y is equal to [(n − 1)E + 1] of Cooper (1970)
and Thompson, Moran, & Swenson (1986). It also corresponds to a quantity defined by Jenet & Anderson (1998).
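
For a 4-level curve with weights 1 and n and threshold v_0, the Gaussian integrals in the series for the mean correlation can be done in closed form. The expressions in the sketch below were worked out for this illustration from the definitions of B_X and D_X and should be read as assumptions to be checked against them; the function names are hypothetical, and identical curves are assumed for x and y.

```python
import math


def b_and_d(v0=1.0, n=3.0):
    """Closed forms of B_X and D_X (Eqs. 38-39) for a 4-level curve:
    B_X = sqrt(2/pi) * [1 + (n - 1) exp(-v0^2 / 2)]
    D_X = sqrt(2/pi) * [2 + (n - 1) (v0^2 + 2) exp(-v0^2 / 2)]."""
    pref = math.sqrt(2.0 / math.pi)
    e = math.exp(-0.5 * v0 * v0)
    b = pref * (1.0 + (n - 1.0) * e)
    d = pref * (2.0 + (n - 1.0) * (v0 * v0 + 2.0) * e)
    return b, d


def mean_series(rho, v0=1.0, n=3.0):
    """Mean quantized correlation through third order in rho (Eq. 40)."""
    b, d = b_and_d(v0, n)
    return b * b * rho + (3.0 * b - d) ** 2 * rho ** 3 / 6.0
```

For n = 1 the series becomes (2/π)(ρ + ρ³/6), the leading terms of the two-level result (2/π) arcsin ρ.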

Figure 2 shows typical results of the expansion of Eq. 40, for a 4-level correlator, and compares this
estimate for r_4(ρ) with the results from direct integration of Eq. 25 over rectangles in the X−Y plane. In
this example, v_0x = v_0y = v_0. For both v_0 = 1 and v_0 = 0.602, ⟨r_M⟩ is relatively flat, with a sharp upturn
very close to ρ = 1. However, in both cases, but especially for v_0 = 0.602, the curve of r_4 bends upward well
before ρ = 1, so that the linear approximation is good only for relatively small ρ.



3.2.2. Noise: Approximate Results 

Expressions for the noise involve the integral Υ_X2Y2 (Eq. 35). This integral involves
only the squares of the characteristic curves X̂(X)² and Ŷ(Y)². Because the characteristic curves are
antisymmetric about 0, their squares are symmetric: X̂(X)² = X̂(−X)² and Ŷ(Y)² = Ŷ(−Y)². Therefore,
the only contributions come from terms in the expansion of P_2(X, Y) (Eq. 37) that are even in X and Y. One
thus encounters the integrals:

    A_{X2} = \int dX\, \frac{1}{\sqrt{2\pi}} e^{-X^2/2}\, \hat{X}(X)^2    (41)
    C_{X2} = \int dX\, \frac{1}{\sqrt{2\pi}} e^{-X^2/2}\, \hat{X}(X)^2\, X^2
    E_{X2} = \int dX\, \frac{1}{\sqrt{2\pi}} e^{-X^2/2}\, \hat{X}(X)^2\, X^4,

and analogously for A_Y2, C_Y2, and E_Y2. Then, through fourth order in ρ,

    \Upsilon_{X2Y2} = \int dX\, dY\, P_2(X, Y)\, \hat{X}(X)^2\, \hat{Y}(Y)^2    (42)
      = \int dX\, dY\, \frac{1}{2\pi} e^{-(X^2 + Y^2)/2}\, \hat{X}(X)^2\, \hat{Y}(Y)^2
          \times \left[ 1 + \frac{1}{2} (1 - X^2)(1 - Y^2) \rho^2 + \frac{1}{24} (3 - 6X^2 + X^4)(3 - 6Y^2 + Y^4) \rho^4 + \ldots \right]
      \approx A_{X2} A_{Y2} + \frac{1}{2} (A_{X2} - C_{X2})(A_{Y2} - C_{Y2}) \rho^2 + \frac{1}{24} (3 A_{X2} - 6 C_{X2} + E_{X2})(3 A_{Y2} - 6 C_{Y2} + E_{Y2}) \rho^4 + \ldots.

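I find the standard deviations of real and imaginary parts from Eqs. 35, 40, and 42:

    \langle \mathrm{Re}[r_M]^2 \rangle - \langle \mathrm{Re}[r_M] \rangle^2 \approx \frac{1}{2N_q} \Big( \{ A_{X2} A_{Y2} \}    (43)
          + \left\{ \frac{1}{2} (A_{X2} - C_{X2})(A_{Y2} - C_{Y2}) - B_X^2 B_Y^2 \right\} \rho^2
          + \left\{ \frac{1}{24} (3 A_{X2} - 6 C_{X2} + E_{X2})(3 A_{Y2} - 6 C_{Y2} + E_{Y2})
              - \frac{1}{3} B_X (3 B_X - D_X) B_Y (3 B_Y - D_Y) \right\} \rho^4 + \ldots \Big)

    \langle \mathrm{Im}[r_M]^2 \rangle \approx \frac{1}{2N_q} \Big( \{ A_{X2} A_{Y2} \}
          - \{ B_X^2 B_Y^2 \} \rho^2
          - \left\{ \frac{1}{3} B_X (3 B_X - D_X) B_Y (3 B_Y - D_Y) \right\} \rho^4 + \ldots \Big).

I have used the fact that ⟨r_M⟩ is purely real; this is a consequence of the assumption that ρ is purely real.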

Figure 3 shows examples of the standard deviations of Re[r_4] and Im[r_4] for 2 choices of v_0. These are
the noise in estimates of the correlation. Note that the noise varies with ρ. The quadratic variation of these
quantities with ρ is readily apparent. The higher-order variation is more subtle, although it does lead to an
upturn of the standard deviation of Re[r_4] near ρ ≈ 0.7, for v_0 = 1. The series expansions become inaccurate
near ρ = 1, as expected. The standard deviation of Re[r_4] can also increase, instead of decrease, for large
ρ. Such an increase is more common for parameter choices with v_0 > 1. Again, note that the standard
deviation of the imaginary part always decreases with increasing ρ.



3.2.3. SNR for Quantized Correlation 

The signal-to-noise ratio (SNR) for a quantizing correlator is the quotient of the mean and standard deviation of
r_M: the results of § 3.2.1 and 3.2.2. I recover the results of Cooper (1970) for the SNR for a quantizing
correlator, by using the approximate expressions through first order in ρ (see also Hagen & Farley (1973),
Thompson, Moran, & Swenson (1986), Jenet & Anderson (1998)):

    \mathcal{R}_M(\mathrm{Re}[r_M]) = \frac{\langle \mathrm{Re}[r_M] \rangle}{\sqrt{\langle \mathrm{Re}[r_M]^2 \rangle - \langle \mathrm{Re}[r_M] \rangle^2}} = \sqrt{2N_q}\, \frac{B_X B_Y\, \rho}{\sqrt{A_{X2} A_{Y2}}}.    (44)

For a 4-level correlator, a SNR of ℛ_4 = 0.88115 × ρ, in units of √(2N_q), is attained for n = 3, v_0 = 1, in the
limit ρ → 0. Many 4-level correlators use these values. The maximum value for ℛ_4 is actually obtained for n = 3.3359,
v_0 = 0.9815, for which ℛ_4 = 0.88252 × ρ in the same units. This adjustment of quantization constants provides
a very minor improvement in SNR.
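
The quoted first-order SNR factors can be reproduced from closed-form Gaussian integrals for a 4-level curve with weights 1 and n. In this sketch the closed forms, the function name `efficiency`, and the assumption of identical curves for x and y are illustrative; the returned quantity is B_X B_Y/√(A_X2 A_Y2), the factor that multiplies √(2N_q) ρ in the small-covariance limit.

```python
import math


def efficiency(v0, n):
    """First-order SNR factor for identical 4-level curves,
    using B_X = sqrt(2/pi) * [1 + (n - 1) exp(-v0^2 / 2)]
    and   A_X2 = 1 + (n^2 - 1) * erfc(v0 / sqrt(2))."""
    b = math.sqrt(2.0 / math.pi) * (1.0 + (n - 1.0) * math.exp(-0.5 * v0 * v0))
    a = 1.0 + (n * n - 1.0) * math.erfc(v0 / math.sqrt(2.0))
    return b * b / a
```

Evaluating `efficiency(1.0, 3.0)` and `efficiency(0.9815, 3.3359)` gives values close to 0.88115 and 0.88252, while n = 1 gives the two-level factor 2/π.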

For nonvanishing ρ, the optimum level settings depend upon the covariance ρ. The signal-to-noise ratio
is

    \mathcal{R}_M(\mathrm{Re}[r_M]) = \sqrt{2N_q}\, \frac{\Upsilon_{XY}}{\sqrt{\Upsilon_{X2Y2} - (\Upsilon_{XY})^2}}.    (45)

This can be approximated using the expansions for Υ_XY and Υ_X2Y2; Figure 4 shows the results. Note
that in the examples in the figure, the SNR for the quantized correlations actually curves above that for the
unquantized correlation beyond ρ ≈ 0.5; this indicates that correlation of quantized signals can actually yield
higher signal-to-noise ratio than would be obtained from correlating the same signals before quantization.
This results from the decline in noise with increasing ρ visible in Fig. 3.

For a proper comparison of SNRs, one must compare with the SNR obtained for non-quantized correlation,
ℛ_∞(Re[r_∞]), Eq. 19. One finds:

$$\frac{\mathcal{R}_M(\mathrm{Re}[\hat r_M])}{\mathcal{R}_\infty(\mathrm{Re}[r_\infty])} = \frac{\Upsilon_{XY}}{\sqrt{\Upsilon_{X2Y2} - (\Upsilon_{XY})^2}}\,\frac{\sqrt{1+\rho^2}}{\rho}. \qquad (46)$$

Figure 5 shows this ratio for 2 choices of vq. The ratio can exceed 1, again indicating that quantized
correlation can provide a more accurate result than would correlation of an unquantized signal.
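
The ratio of quantized to unquantized SNR can also be estimated numerically. The following is a minimal Monte Carlo sketch, not the procedure used in the paper. It assumes a 4-level quantizer with weights ±1 and ±n and threshold vq, unit-variance real Gaussian inputs with covariance ρ, a per-sample noise variance of ⟨X̂²Ŷ²⟩ − ⟨X̂Ŷ⟩² for the quantized product, and an unquantized SNR proportional to ρ/√(1+ρ²). The names `quantize4` and `snr_ratio` are hypothetical.

```python
import math
import random

def quantize4(x, vq=1.0, n=3.0):
    """4-level quantizer: weights +/-1 inside the thresholds +/-vq, +/-n outside."""
    level = 1.0 if abs(x) < vq else n
    return level if x >= 0.0 else -level

def snr_ratio(rho, samples=100_000, vq=1.0, n=3.0, seed=1):
    """Monte Carlo estimate of quantized SNR divided by unquantized SNR (real part)."""
    rng = random.Random(seed)
    b1, b2 = math.sqrt(1.0 + rho), math.sqrt(1.0 - rho)
    root_half = math.sqrt(0.5)
    sum_xy = 0.0
    sum_x2y2 = 0.0
    for _ in range(samples):
        u, v = rng.gauss(0.0, b1), rng.gauss(0.0, b2)
        # Rotate by pi/4 so X and Y have unit variance and covariance rho.
        x = quantize4(root_half * (u - v), vq, n)
        y = quantize4(root_half * (u + v), vq, n)
        sum_xy += x * y
        sum_x2y2 += (x * y) ** 2
    mean_xy = sum_xy / samples        # estimate of the mean quantized product
    mean_x2y2 = sum_x2y2 / samples    # estimate of its mean square
    quantized = mean_xy / math.sqrt(mean_x2y2 - mean_xy ** 2)
    unquantized = rho / math.sqrt(1.0 + rho ** 2)
    return quantized / unquantized

for rho in (0.3, 0.7, 1.0):
    print(rho, snr_ratio(rho))
```

For ρ = 1 the two quantized outputs are identical, and under the same assumptions the ratio can be evaluated in closed form, giving about 1.34 for vq = 1, n = 3; the Monte Carlo estimate should approach that value.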

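The SNR for a measurement of phase for quantized correlation is the inverse of the standard deviation
of the phase, as discussed in § 19. For quantized correlation, this is

$$\mathcal{R}_M(\phi[\hat r_M]) = \frac{\langle \mathrm{Re}[\hat r_M]\rangle}{\sqrt{\langle \mathrm{Im}[\hat r_M]^2\rangle}} = \sqrt{2N_q}\,\frac{\Upsilon_{XY}}{\sqrt{A_{X2}A_{Y2} - (\Upsilon_{XY})^2}}. \qquad (47)$$

Again, because Υ_XY always increases with ρ, the SNR of the phase always increases with increasing
covariance.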
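
The noise in the imaginary part that enters the phase SNR can be motivated by a short calculation. The sketch below assumes that the quantized correlation is normalized as r̂_M = (1/2Nq) Σ x̂_i ŷ_i*, that A_X2 and A_Y2 denote the mean squares of the quantized real (or imaginary) parts, that Υ_XY denotes their mean product, and that successive samples, and the real and imaginary parts of each sample, are independent. The subscripts R and I for real and imaginary parts are introduced here only for illustration.

```latex
% Imaginary part of one product, with
% \hat x = \hat X_R + i \hat X_I and \hat y = \hat Y_R + i \hat Y_I:
\mathrm{Im}[\hat x\,\hat y^*] = \hat X_I \hat Y_R - \hat X_R \hat Y_I .

% Its mean vanishes, and its mean square is
\langle \mathrm{Im}[\hat x\,\hat y^*]^2 \rangle
  = \langle \hat X_I^2 \rangle \langle \hat Y_R^2 \rangle
  + \langle \hat X_R^2 \rangle \langle \hat Y_I^2 \rangle
  - 2\,\langle \hat X_I \hat Y_I \rangle \langle \hat X_R \hat Y_R \rangle
  = 2\left( A_{X2} A_{Y2} - \Upsilon_{XY}^2 \right) .

% Averaging N_q independent products with the 1/(2 N_q) normalization gives
\langle \mathrm{Im}[\hat r_M]^2 \rangle
  = \frac{N_q}{(2 N_q)^2}\; 2\left( A_{X2} A_{Y2} - \Upsilon_{XY}^2 \right)
  = \frac{A_{X2} A_{Y2} - \Upsilon_{XY}^2}{2 N_q} .
```

In the unquantized limit, A_X2 = A_Y2 = 1 and Υ_XY = ρ, so this reduces to (1 − ρ²)/(2Nq), which decreases as ρ grows.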

The ratio of the SNR for phase to that for correlation of an unquantized signal, ℛ_M(φ[r̂_M])/ℛ_∞(φ[r_∞])
(see Eq. 20), provides an interesting comparison. For ρ → 0, the statistics for the imaginary part of the
correlation are identical to those for the real part (as they must be), and the highest SNR for the phase is
given by the quantizer parameters that are optimal for the real part, traditionally vq = 1, n = 3. This ratio
is approximately constant with ρ up to ρ ≈ 0.6, and then decreases rather rapidly. Simulations suggest that
quantized correlation is less efficient than unquantized for measuring phase; however, I have not proved this
in general.



4. SIMULATIONS 

Simulation of a 4-level correlator provides a useful perspective. I simulated such a correlator by generating
two sequences of random, complex numbers, x_i and y_i. The real parts of x_i and y_i are drawn from one
bivariate Gaussian distribution, and their imaginary parts from another independent one (see Eq. 8). These
bivariate Gaussian distributions can be described equivalently as elliptical Gaussian distributions, with major
and minor axes inclined to the coordinate axes X = Re[x_i] and Y = Re[y_i] and the corresponding axes
for the imaginary parts.

Meyer (1975) gives expressions that relate the semimajor and semiminor axes and angle of inclination of
an elliptical Gaussian distribution to the normalized covariance ρ and variances σ_X² and σ_Y². For the special
case of σ_X = σ_Y = 1 used in this work, the major axis always lies at angle π/4 to both coordinate axes,
along the line X = Y. The semimajor axis b_1 and semiminor axis b_2 are then given by

$$b_1 = \sqrt{1+\rho}, \qquad (48)$$

$$b_2 = \sqrt{1-\rho}.$$

To form the required elliptical distributions, I drew pairs of elements from a circular Gaussian distribution,
using the Box-Muller method (see Press et al. (1989)). I scaled these random elements so that their
standard deviations were b_1 and b_2. I then rotated the resulting 2-element vector by π/4 to express the
results in terms of X and Y. I repeated the procedure for the imaginary part.
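
The scale-and-rotate construction described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the paper: it takes unit variances for X and Y, uses Python's `random.gauss` in place of an explicit Box-Muller routine, and the function names `correlated_pair` and `complex_pair` are hypothetical.

```python
import math
import random

def correlated_pair(rho, rng):
    """Return one (X, Y) pair with unit variances and covariance rho."""
    b1 = math.sqrt(1.0 + rho)   # semimajor axis, along X = Y
    b2 = math.sqrt(1.0 - rho)   # semiminor axis, along X = -Y
    u = b1 * rng.gauss(0.0, 1.0)
    v = b2 * rng.gauss(0.0, 1.0)
    # Rotate the (u, v) vector by pi/4 into the (X, Y) coordinates.
    c = s = math.sqrt(0.5)
    return c * u - s * v, s * u + c * v

def complex_pair(rho, rng):
    """Draw real and imaginary parts independently from the same distribution."""
    xr, yr = correlated_pair(rho, rng)
    xi, yi = correlated_pair(rho, rng)
    return complex(xr, xi), complex(yr, yi)

rng = random.Random(0)
print(complex_pair(0.6, rng))
```

With this rotation, the variance of each of X and Y is (b_1² + b_2²)/2 = 1 and their covariance is (b_1² − b_2²)/2 = ρ, which is the property the construction is meant to deliver.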

I quantized the sequences x_i and y_i according to the 4-level characteristic curve shown in Figure 1, to
yield x̂_i and ŷ_i. Both the unquantized and the quantized sequences were correlated by forming the products
x_i y_i* and x̂_i ŷ_i*, respectively, and results were averaged over Nq = 10^ instances of the index i. This procedure
yields one realization each of r_∞ and r̂_4. I found that values for Nq smaller than about 10^ could produce
significant departures from Gaussian statistics for r̂_4, particularly for larger values of ρ.
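
One realization of the quantize-and-correlate step can be sketched as below. The quantizer weights (±1 and ±n, with threshold vq), the 1/(2 Nq) normalization of the correlation, and the names `quantize4`, `quantize_complex`, and `one_realization` are assumptions made for illustration; the paper's Figure 1 defines the actual characteristic curve.

```python
import math
import random

def quantize4(x, vq=1.0, n=3.0):
    """4-level characteristic curve: +/-1 for |x| < vq, +/-n otherwise."""
    level = 1.0 if abs(x) < vq else n
    return level if x >= 0.0 else -level

def quantize_complex(z, vq=1.0, n=3.0):
    """Apply the same characteristic curve to real and imaginary parts."""
    return complex(quantize4(z.real, vq, n), quantize4(z.imag, vq, n))

def one_realization(rho, nq, vq=1.0, n=3.0, seed=0):
    """Return (r, r_hat): unquantized and quantized correlations over nq samples."""
    rng = random.Random(seed)
    b1, b2 = math.sqrt(1.0 + rho), math.sqrt(1.0 - rho)
    root_half = math.sqrt(0.5)

    def pair():
        # Elliptical Gaussian rotated by pi/4: unit variances, covariance rho.
        u, v = rng.gauss(0.0, b1), rng.gauss(0.0, b2)
        return root_half * (u - v), root_half * (u + v)

    r = 0j
    r_hat = 0j
    for _ in range(nq):
        (xr, yr), (xi, yi) = pair(), pair()
        x, y = complex(xr, xi), complex(yr, yi)
        r += x * y.conjugate()
        r_hat += quantize_complex(x, vq, n) * quantize_complex(y, vq, n).conjugate()
    norm = 2.0 * nq  # assumed normalization, so that the mean of r is rho
    return r / norm, r_hat / norm

print(one_realization(0.3, 20000))
```

With this normalization the real part of the unquantized result should scatter around ρ, while the quantized result scatters around a larger value set by the quantizer weights; both imaginary parts should scatter around zero.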

I repeated the process to obtain 4096 different realizations of r_∞ and r̂_4. I found the averages and
standard deviations for the real and imaginary parts for this set of realizations. Figures 2 through 5 show
these statistical results of the simulations, and compare them with the mathematical results of the preceding
sections. Clearly, the agreement is good.
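
The statistics over many realizations can be illustrated for the unquantized case, where the expected noise is simple. The sketch below is a scaled-down illustration, not the paper's simulation: it uses far fewer realizations and samples than the text describes, assumes the 1/(2 Nq) normalization, and compares against standard deviations of √((1+ρ²)/2Nq) for the real part and √((1−ρ²)/2Nq) for the imaginary part. The names `realization` and `summarize` are hypothetical.

```python
import math
import random
import statistics

def realization(rho, nq, rng):
    """One unquantized correlation r averaged over nq complex samples."""
    b1, b2 = math.sqrt(1.0 + rho), math.sqrt(1.0 - rho)
    root_half = math.sqrt(0.5)
    total = 0j
    for _ in range(nq):
        # Real parts of x and y.
        u, v = rng.gauss(0.0, b1), rng.gauss(0.0, b2)
        xr, yr = root_half * (u - v), root_half * (u + v)
        # Imaginary parts, drawn independently from the same distribution.
        u, v = rng.gauss(0.0, b1), rng.gauss(0.0, b2)
        xi, yi = root_half * (u - v), root_half * (u + v)
        total += complex(xr, xi) * complex(yr, -yi)  # x times conj(y)
    return total / (2.0 * nq)

def summarize(rho, nq=500, realizations=200, seed=2):
    """Mean and standard deviation of real and imaginary parts over realizations."""
    rng = random.Random(seed)
    rs = [realization(rho, nq, rng) for _ in range(realizations)]
    re = [r.real for r in rs]
    im = [r.imag for r in rs]
    return (statistics.mean(re), statistics.stdev(re),
            statistics.mean(im), statistics.stdev(im))

mean_re, std_re, mean_im, std_im = summarize(0.8)
print(mean_re, std_re, math.sqrt((1 + 0.8 ** 2) / 1000))
print(mean_im, std_im, math.sqrt((1 - 0.8 ** 2) / 1000))
```

The same summary applied to quantized realizations would give the simulated points that the figures compare with the analytic curves.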

In graphical form, samples of the correlation form an elliptical Gaussian distribution in the complex
plane, centered at the mean value of correlation ⟨r_∞⟩ or ⟨r̂_M⟩, as the case may be. The principal axes of the
distribution lie along the real and imaginary directions (or, more generally, the directions in phase with ρ
and out of phase with ρ). The lengths of these principal axes are set by the standard deviations of the real
and imaginary parts.

5. DISCUSSION AND SUMMARY 

5.1. Change of Noise with Covariance 

The fundamental result of this paper is that a change in covariance ρ affects quantized correlation
r̂_M differently from unquantized correlation r_∞. For unquantized correlation, an increase in covariance ρ
increases noise for estimation of signal amplitude. For quantized correlation, an increase in ρ can increase
or decrease amplitude noise. For both quantized and unquantized correlation, an increase in ρ leads to a
decrease in phase noise. In this work, I arbitrarily set the phase of ρ to 0, so that amplitude corresponds to
the real part, and phase to the imaginary part of the estimated correlation. The net noise (summed, squared
standard deviations of real and imaginary parts) can decrease (or increase) with increasing ρ: noise is not
conserved.

I present expressions for the noise as a function of quantization parameters, both as exact expressions
that depend on ρ and as power-series expansions in ρ. These expressions, and a power-series expansion for
the mean correlation, are given through fourth order in ρ.

The increase in noise with covariance ρ for analog correlation, sometimes called source noise or self-noise,
is sometimes ascribed to the contribution of the original, noiselike signal to the noise of correlation.
This idea is difficult to generalize to comparisons among quantized signals, because such comparisons require
additional assumptions about changes in quantizer levels and the magnitude of the quantized signal when
the covariance changes. These comparisons are simpler for multiple correlations derived from a single signal
(as, for example, for the correlation function of a spectrally-varying signal), and I will discuss them in that
context elsewhere (Gwinn 2003). The discussion in this paper is limited to "white" signals, without spectral
variation and with only a single independent covariance.

5.1.1. Increase in SNR via Quantization 

One interesting consequence of these results is that the signal-to-noise ratio of correlation can actually be
greater for quantized signals than it would be for correlation of the same signals before quantization. At small
covariance ρ, SNR is always lower for quantized signals, but this need not be the case for covariance ρ ≳ 0.4.
This appears to present a paradox, because the process of quantization intrinsically destroys information:
the quantized signals x̂_i, ŷ_i contain less information than did the original signals x_i, y_i. However, correlation
of unquantized signals also destroys information: it converts x_i and y_i to the single quantity ⟨x_i y_i*⟩.
Different information is destroyed in the two cases.

Moreover, correlation does not always yield the most accurate estimate of the covariance ρ. As a simple
example, consider the series {X_i} = {0.400, -0.800, 1.600} and {Y_i} = {0.401, -0.799, 1.600}. Here, Nq = 3.
One easily sees that X and Y are highly correlated. If X and Y are known to be drawn from Gaussian
distributions with unit standard deviation, Eq. 48 suggests that ρ ≈ 0.999. However, the correlation is
r = (1/3) Σ_i X_i Y_i ≈ 1.12. Clearly r is not an optimal measurement of ρ. I will discuss strategies for optimal
estimates of covariance elsewhere (Gwinn 2003).

5.1.2. Quantization Noise 

Sometimes effects of quantization are described as "quantization noise": an additional source of noise
that (like "sky noise" or "receiver noise") reduces the correlation of the desired signal. However, unlike other
sources of noise, quantization destroys information in the signals, rather than adding unwanted information.
The discussion of the preceding section suggests that the amount of information that quantization destroys
(or, more loosely, the noise that it adds) depends on what information is desired; and that correlation removes
information as well. Unless the covariance ρ is small, effects of quantization cannot, in general, be represented
as one additional, independent source of noise.

5.1.3. Applications 

The primary result of this paper is that for quantized correlation, noise can increase or decrease when
covariance increases; whereas for continuous signals it increases. This fact is important for applications
requiring accurate knowledge of the noise level; as for example in studies of rapidly-varying strong sources
such as pulsars, where one wishes to know whether a change in correlation represents a significant change in
the pulsar's emission; or for single-dish or interferometric observations of intra-day variable sources, where
one wishes to know whether features that may appear and disappear are statistically significant.

A second result of this paper is that the signal-to-noise ratio for quantized correlation can be quite
different from that expected for a continuum source, or for a continuum source with added noise. This
effect is most important for large correlation, ρ ≳ 0.5. Correlation this large is often observed for strong
sources, such as the strongest pulsars, or maser lines; and for the strongest continuum sources observed with
sensitive antennas. For example, at Arecibo Observatory a strong continuum source easily dominates the
system temperature, at many observing frequencies. The effect will be even more common for some proposed
designs of the Square Kilometer Array.

Many sources show high correlation that varies dramatically with frequency; such sources include
scintillating pulsars and maser lines. Typically, observations of these sources involve determination of the full
correlation function, and a Fourier transform to obtain an estimated cross-power or autocorrelation spectrum.
I discuss the properties of noise for this analysis elsewhere.

5.1.4. SNR Enhancement? 

An interesting question is whether one can take advantage of the higher SNR afforded by quantization at
high ρ even for weakly-correlated signals, perhaps by adding an identical signal to each of 2 weakly covariant
signals and so increasing their covariance ρ, before quantizing them. The answer appears to be "no". As a
simple example, consider a pair of signals with covariance of ρ ≈ 0.01. After correlation of Nq = 2 × 10^
instances of the signal, using a 4-level correlator with vq = 1 and n = 3, the SNR is 20, and one can determine
at a level of 2 standard deviations whether ρ = 0.010 or ρ = 0.011. If a single signal, with 4.6 times greater
amplitude, is added to both of the original signals, then these 2 cases correspond to ρ = 0.7004 or ρ = 0.7005.
To distinguish them at 2 standard deviations requires a SNR of 1400, requiring Nq = 4 × 10^ samples of the
quantized correlation. Thus, the increase in SNR is more than outweighed by the reduction in the influence
on the observable.



5.2. Summary 

In this paper, I consider the result of quantizing and correlating two complex noiselike signals, x and y,
with normalized covariance ρ. The signals are assumed to be statistically stationary, "white," and sampled
at the Nyquist rate. The correlation r, averaged over many samples, provides a measurement of ρ. The
variation of r about its mean, characterized by its standard deviation, provides a measure of the random part
of the measurement, or noise.

I suppose that the signals x and y are quantized to form x̂ and ŷ. I suppose that the characteristic curves
that govern quantization are antisymmetric, with real and imaginary parts subject to the same characteristic
curve. I recover the classic results for the noise for ρ = 0, and for the mean correlation, to first order in ρ,
in the limit ρ → 0 (Van Vleck & Middleton 1966; Cooper 1970; Hagen & Farley 1973; Thompson, Moran,
& Swenson 1986). I find exact expressions for the mean correlation and the noise, and approximations valid
through fourth order in ρ. I compare results with simulations. Agreement is excellent for the exact forms,
and good for ρ not too close to 1, for the approximate expressions.

I find that for nonzero values of ρ, the noise varies, initially quadratically, with ρ. I find that the noise
in an estimate of the amplitude of ρ can decrease with increasing ρ; this is opposite the behavior of noise for
correlation of unquantized signals, for which noise always increases with ρ. The mean correlation can increase
more rapidly than linearly with ρ. The signal-to-noise ratio (SNR) for correlation of quantized signals can be
greater than that for correlation of unquantized signals, for ρ > 0.5. In other words, correlation of quantized
signals can be more efficient than correlation of the same signals before quantization, as a way of determining
the covariance ρ.

I am grateful to the DRAO for supporting this work with extensive correlator time. I also gratefully 
acknowledge the VSOP Project, which is led by the Japanese Institute of Space and Astronautical Science in 
cooperation with many organizations and radio telescopes around the world. I thank an anonymous referee 
for useful comments. The U.S. National Science Foundation provided financial support.

Fig. 1. - Characteristic curve X̂(X) for 4-level quantization.

Fig. 2. - Average correlation plotted with covariance ρ. Curves show ⟨r_∞⟩, and approximations to ⟨r̂_4⟩
for vq = 1.0 and vq = 0.602, with n = 3. Heavy lines show the third-order approximation of Eq. 40, and
light lines show the linear approximation of Cooper (1970). Circles show true values as computed by direct
integration of Eq. 25. Crosses show results of simulations (§ 4). The sharp rise in ⟨r̂_4⟩ for ρ ≈ 1, shown by the
circles at far right, motivates the approximation of Jenet & Anderson (1998) that ⟨r̂_4⟩ varies proportionately
with ρ for ρ < 1, with a spike at ρ = 1.

Fig. 3. - Standard deviation of the real part of correlation (left panel), and of the imaginary part (right
panel), plotted with covariance ρ. Both are normalized for the number of samples by the factor √(2Nq).
Curves show r_∞, and approximations to r̂_4 for quantization with vq = 1.0 and vq = 0.602, with n = 3.
Heavy lines show the fourth-order approximation of Eq. 43, and light lines show the approximation to second
order. The values found by Cooper (1970) are the y-intercepts (ρ = 0). Filled circles show true values as
computed by direct integration of Eq. 35. Crosses show results of simulations (§ 4).

Fig. 4. - Signal-to-noise ratio ℛ_∞ or ℛ_4, normalized for number of samples by 1/√(2Nq) and plotted with
covariance ρ. Curves mark the exact expression given by Eq. 19 for ℛ_∞, or the approximate expression given
by the expansions of Eqs. 40 and 43 through fourth order in Eq. 45. Dots give exact values as found from
direct integration of Eq. 35 in Eq. 45. Crosses show results of computer simulations in § 4. Quantization
uses the characteristic curve in Fig. 1 with values of vq = 1.0 or vq = 0.602, and n = 3.

Fig. 5. - SNR for quantized signals, normalized to SNR of the unquantized signal: ℛ_4/ℛ_∞, plotted with
covariance ρ. Curves mark the approximate expressions given by Eqs. 40 and 43 for ℛ_4, normalized by ℛ_∞
as given by Eq. 18. Dots give exact expressions as found from direct integration of Eq. 25. The y-intercept
for vq = 1.0 is the standard SNR for a 4-level correlator, ℛ_4 = 0.88115, with vq = 1.0 and n = 3, which is
optimal for ρ = 0. Note that the ratio can be greater than 1; this indicates that quantized correlation can
be more efficient than correlation of the original, unquantized signals.



